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ABSTRACT 

The parameter estimation methods considered in this
thesis are weighted least-squares and weighted Huber
estimation for some non-linear growth models. The properties
of the parameter estimators derived from simulated data
by means of (1) weighted and unweighted least-squares
and (2) weighted and unweighted Huber robust estimation
are compared. The error components of the simulated data
are long-tailed and non-normal. The performance of
mis-specified models is also considered.

I. INTRODUCTION


The properties of the least-squares estimator for the
linear model are well known [Ref. 2]. Among its important
properties is that it has minimum variance among the class
of linear unbiased estimators. In Chapter II the properties
of the least-squares solution for $\ln A$ in the growth model

$$Y(t_i) = A\,e^{-B/t_i}\,e^{\varepsilon(t_i)}, \qquad i = 1, \ldots, n$$

will be reviewed. Then the estimate $\hat{A}$ and its statistical
properties will be derived from the estimator for $\ln A$ by
means of the Taylor series approximation. Thus the emphasis
will be on estimation of the "final value" of the growth
process.

The least-squares estimator is easy to compute and has
desirable properties when the assumption of normality is
justified. However, many of the properties of the
least-squares estimators are not robust against departures
from the normality assumption (Scheffé [3]).

Recently (Huber [1]) statisticians have shown considerable
interest in finding estimators which are robust against
non-normality of the error distribution. Examples of non-normal
distributions of interest are long-tailed distributions such
as a mixture of two normals, and also the double exponential
and Cauchy. Several robust methods have been considered,
such as:

1. Using the median,

2. Jackknifing,

3. Trimmed mean,

4. Huber estimator.

Of these methods only the Huber estimator will be considered,
in Chapter III. One major difficulty with the Huber estimation
method is that the solution process leads to a system of
non-linear equations which cannot, in general, be solved
analytically. Although some asymptotic properties of the
Huber estimator have been derived, the small sample
properties must generally be obtained by computer simulation.
Finally, a problem is said to be misspecified if the
data come from the model

$$y_i = g(x_i, \varepsilon_i) \qquad \text{(1-1)}$$

but the model used is

$$y_i = f(x_i, \varepsilon_i) \qquad \text{(1-2)}$$

where the functions $f$ and $g$ are not identical.

Since in practice the true model is not known, it is
important to know how good the fits or estimates are that are
obtained by using reasonable alternative models. In Chapter
IV the properties of several mis-specified model fits are
compared when the true model is

$$Y(t_i) = A\,\frac{B t_i}{1 + B t_i}\,e^{\varepsilon(t_i)}$$

for $A = 1$ and $B = 0.0772$, using computer simulation.
The reason for choosing $B = 0.0772$ is that for this value
of $B$, the largest value of $E(Y)$ for the true model
is 0.6, i.e. 60% of the final value ($A = 1$) eventually reached.

II. LEAST-SQUARES ESTIMATION OF ln A AND TAYLOR SERIES
APPROXIMATION OF A

A. LEAST-SQUARES ESTIMATION OF ln A


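Consider the model

$$Y = A\,e^{-B/t}\,e^{\varepsilon} \qquad \text{(2-1)}$$

where $t$ is the independent variable, $Y$ is the dependent
variable, and $\varepsilon$ a random variable.

Let the $n$ pairs of sample observations of $Y$ and $t$ be
$(t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n)$.

Assume the hypothesis

$$E(\varepsilon_i) = 0, \quad i = 1, \ldots, n; \qquad
E(\varepsilon_i \varepsilon_j) =
\begin{cases} 0, & i \neq j \\ \sigma^2, & i = j \end{cases}
\qquad \text{(2-2)}$$

The distribution of $\ln \hat{A}$ and the approximate distribution of
$\hat{A}$ may be found by taking logarithms of equation (2-1), i.e.

$$\ln Y = \ln A - B/t + \varepsilon. \qquad \text{(2-3)}$$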


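The properties of $\ln \hat{A}$ can be derived by considering the
linear model

$$Z = \alpha + \beta x + u \qquad \text{(2-4)}$$

where

$$Z = \ln Y, \quad \alpha = \ln A, \quad \beta = -B, \quad x = 1/t, \quad u = \varepsilon.$$

Assume the following hypothesis for the linear model

$$Z_i = \alpha + \beta x_i + u_i, \quad i = 1, \ldots, n$$

$$E(u_i) = 0, \quad i = 1, \ldots, n; \qquad
E(u_i u_j) =
\begin{cases} 0, & i \neq j \\ \sigma_u^2, & i = j \end{cases}
\qquad \text{(2-5)}$$

where $\alpha$, $\beta$ and $\sigma_u^2$ are unknown parameters. The principle of
least-squares is to choose estimates $\hat{\alpha}$ and $\hat{\beta}$ of $\alpha$ and $\beta$ such
that $\sum_{i=1}^{n} \hat{u}_i^2$ is minimized, where
$\hat{u}_i = Z_i - \hat{\alpha} - \hat{\beta} x_i$.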


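The estimates $\hat{\alpha}$ and $\hat{\beta}$ are given by

$$\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(z_i - \bar{z})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad \hat{\alpha} = \bar{z} - \hat{\beta}\,\bar{x} \qquad \text{(2-6)}$$

where

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i.$$

From these expressions it can be shown that

$$E(\hat{\alpha}) = \alpha, \qquad
\operatorname{Var}(\hat{\alpha}) = \frac{\sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sigma_u^2. \qquad \text{(2-7)}$$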
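The closed-form estimates above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `least_squares_line` and the noise-free test data (A = 2, B = 3) are hypothetical and are not taken from the thesis.

```python
import math


def least_squares_line(x, z):
    """Return (alpha_hat, beta_hat) for the linear model z = alpha + beta*x + u."""
    n = len(x)
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    beta_hat = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / sxx
    alpha_hat = zbar - beta_hat * xbar
    return alpha_hat, beta_hat


# Hypothetical noise-free data from Y = A exp(-B/t) with A = 2, B = 3.
t_values = range(1, 21)
x_values = [1.0 / t for t in t_values]                         # x = 1/t
z_values = [math.log(2.0) - 3.0 / t for t in t_values]         # z = ln Y
alpha_hat, beta_hat = least_squares_line(x_values, z_values)
print(math.exp(alpha_hat), -beta_hat)  # estimates of A and B
```

Without noise the fit recovers A and B up to rounding, since alpha = ln A and beta = -B.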


B. TAYLOR SERIES APPROXIMATION FOR THE LEAST-SQUARES 
ESTIMATOR A 


Let $X$ be a random variable and $Y = f(X)$, where it is
assumed that the function $f$ can be expanded in a Taylor series
about $E(X) = \mu$; that is,

$$f(X) = f(\mu) + (X - \mu) f'(\mu) + O\big((X - \mu)^2\big).$$

Neglecting the higher order terms,

$$E(f(X)) \approx E\big(f(\mu) + (X - \mu) f'(\mu)\big) = f(\mu) \qquad \text{(2-8)}$$

$$\operatorname{Var}(f(X)) = E\big((f(X) - E(f(X)))^2\big)
\approx E\big((f(X) - f(\mu))^2\big)
\approx E\big((X - \mu)^2\big)\,(f'(\mu))^2. \qquad \text{(2-9)}$$


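From (2-3)

$$z = \ln Y = \ln A - B/t + \varepsilon$$

and from (2-7)

$$\operatorname{Var}(\ln \hat{A}) = \frac{\sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sigma^2.$$

Using the relationship $\hat{A} = e^{\ln \hat{A}}$ and the results of (2-8)
and (2-9) with $\mu = E(X) = \ln A$,

$$E(\hat{A}) \approx e^{\mu} = A$$

$$\operatorname{Var}(\hat{A}) \approx \big(e^{\ln A}\big)^2 \operatorname{Var}(\ln \hat{A})
= A^2\,\frac{\sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sigma^2. \qquad \text{(2-10)}$$

Equation (2-10) shows that, using the least-squares method
with the Taylor series approximation, the variance of $\hat{A}$ is
directly proportional to $\sigma^2$.

As an example, for $t = 1, 2, \ldots, 20$, $A = 1$, and $B = 10.0$,

$$\operatorname{Var}(\hat{A}) = 0.0841\,\operatorname{Var}(\varepsilon).$$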
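The proportionality constant in this example can be checked numerically. The sketch below evaluates the factor multiplying Var(e) in (2-10) for x = 1/t; the function name `var_a_factor` is an illustrative assumption. Note that B does not enter the factor, only the design points t and the value of A.

```python
def var_a_factor(t_values, a=1.0):
    """Factor c such that Var(A_hat) ~= c * Var(eps) under (2-10), with x = 1/t."""
    x = [1.0 / t for t in t_values]
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    # Var(ln A_hat) / sigma^2 from (2-7)
    var_ln_a_factor = sum(xi * xi for xi in x) / (n * sxx)
    # Delta-method scaling by (e^{ln A})^2 = A^2
    return a * a * var_ln_a_factor


print(round(var_a_factor(range(1, 21)), 4))  # 0.0841
```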







III. THE HUBER ESTIMATION METHOD 


A. THE HUBER ESTIMATOR WITH SCALE UNKNOWN 

Although least-squares gives the best linear (in the
observations) estimates of the parameters in (2-3) or (2-4),
non-linear estimates may be appropriate when the error
terms ($\varepsilon$'s) are non-normal, as is often true in practice.
Various principles for deriving estimates are possible. In
this thesis an estimator due to Huber [Ref. 1] is utilized.


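Let the density of an observation $Y$ be given by

$$\operatorname{Prob}\{y < Y < y + dy\} = \rho\!\left(\frac{y - \theta}{a}\right)\frac{dy}{a} \qquad \text{(3-1)}$$

where $Y = \theta + a\varepsilon$ and $\rho(z)$ is the density for $\varepsilon$.

The likelihood function for the observations is

$$\prod_{j=1}^{n} \rho\!\left(\frac{y_j - \theta}{a}\right)\frac{1}{a}.$$

The log likelihood function is

$$L = \sum_{j=1}^{n} \log\!\left[\rho\!\left(\frac{y_j - \theta}{a}\right)\frac{1}{a}\right]
= \sum_{j=1}^{n} \left[\log \rho\!\left(\frac{y_j - \theta}{a}\right) - \log a\right].$$

Consider

$$\frac{\partial L}{\partial \theta} = \sum_{j=1}^{n}
\frac{\rho'\!\left(\frac{y_j - \theta}{a}\right)}{\rho\!\left(\frac{y_j - \theta}{a}\right)}
\left(-\frac{1}{a}\right),
\qquad
\frac{\partial L}{\partial a} = \sum_{j=1}^{n} \left[
\frac{\rho'\!\left(\frac{y_j - \theta}{a}\right)}{\rho\!\left(\frac{y_j - \theta}{a}\right)}
\left(-\frac{y_j - \theta}{a^2}\right) - \frac{1}{a}\right].$$

To find the maximum of $L$ where $\theta$ and $a$ are unknown, set
$\partial L/\partial \theta$ and $\partial L/\partial a$ equal to zero:

$$\sum_{j=1}^{n}
\frac{\rho'\!\left(\frac{y_j - \theta}{a}\right)}{\rho\!\left(\frac{y_j - \theta}{a}\right)} = 0,
\qquad
\frac{1}{n}\sum_{j=1}^{n} \left[-
\frac{\rho'\!\left(\frac{y_j - \theta}{a}\right)}{\rho\!\left(\frac{y_j - \theta}{a}\right)}\right]
\frac{y_j - \theta}{a} = 1. \qquad \text{(3-2)}$$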


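In the Huber method with scale unknown, let

$$\psi(z) = -\frac{\rho'(z)}{\rho(z)} \qquad \text{(3-3)}$$

for some function $\psi$ to be chosen.

The equations can be written in the form

$$\sum_{j=1}^{n} \psi\!\left(\frac{y_j - \theta}{a}\right) = 0,
\qquad
\frac{1}{n}\sum_{j=1}^{n} \psi\!\left(\frac{y_j - \theta}{a}\right)\frac{y_j - \theta}{a} = 1. \qquad \text{(3-4)}$$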
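Example 1:

Let

$$\rho(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$$

where $\rho$ is the standard normal density. Then

$$\rho'(z) = -z\,\frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}.$$

The likelihood equations can be written in the form

$$\sum_{j=1}^{n} \frac{y_j - \theta}{a} = 0,
\qquad
\frac{1}{n}\sum_{j=1}^{n} \left(\frac{y_j - \theta}{a}\right)\left(\frac{y_j - \theta}{a}\right) = 1$$

which simplifies to

$$\hat{\theta} = \frac{1}{n}\sum_{j=1}^{n} y_j
\qquad \text{and} \qquad
\hat{a}^2 = \frac{1}{n}\sum_{j=1}^{n} (y_j - \hat{\theta})^2.$$

These estimates are obtained by using

$$\psi(z) = -\frac{\rho'(z)}{\rho(z)} = z. \qquad \text{(3-5)}$$

Thus the estimates with the normal distribution assumption
come from the use of $\psi(z) = z$.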
Example 2:

Let $\rho$ be Cauchy. Then

$$\rho(z) = \frac{1}{\pi\,[1 + z^2]} \qquad \text{and} \qquad \psi(z) = \frac{2z}{1 + z^2}.$$

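Forms of $\psi$ that are generally used

HUBER

$$\psi(z) = \begin{cases} -ka, & z < -ka \\ z, & -ka \le z \le ka \\ +ka, & z > ka \end{cases} \qquad \text{(3-6)}$$

SINE

$$\psi(z) = \begin{cases} \sin(z/k), & |z| < k\pi \\ 0, & \text{otherwise} \end{cases} \qquad \text{(3-7)}$$

TUKEY

$$\psi(z) = \begin{cases} z\left[1 - (z/ka)^2\right]^2, & |z| \le ka \\ 0, & |z| > ka \end{cases} \qquad \text{(3-8)}$$

The parameter $a$ must be selected with reference to a
scale parameter for $\varepsilon$.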
B. ROBUST ESTIMATION OF PARAMETERS 
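Consider the linear model

$$y_j = \alpha + \beta\,(x_j - \bar{x}) + \varepsilon_j, \qquad j = 1, \ldots, n$$

$$E(\varepsilon_j) = 0, \qquad E(\varepsilon_i \varepsilon_j) = 0 \ (i \neq j), \qquad E(\varepsilon_j^2) = \sigma^2.$$

Let the scale factor be $a$. The likelihood function for the
observations is

$$\prod_{j=1}^{n} \rho\!\left(\frac{y_j - \alpha - \beta\,(x_j - \bar{x})}{a}\right)\frac{1}{a}$$

where $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$.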


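Differentiating the log likelihood $L$ with respect to $\alpha$, $\beta$, and $a$
and equating each derivative to zero, with
$r_j = y_j - \alpha - \beta\,(x_j - \bar{x})$,

$$\frac{\partial L}{\partial \alpha} = \sum_{j=1}^{n} \frac{\rho'(r_j/a)}{\rho(r_j/a)}\left(-\frac{1}{a}\right) = 0$$

$$\frac{\partial L}{\partial \beta} = \sum_{j=1}^{n} \frac{\rho'(r_j/a)}{\rho(r_j/a)}\left(-\frac{x_j - \bar{x}}{a}\right) = 0 \qquad \text{(3-9)}$$

$$\frac{\partial L}{\partial a} = \sum_{j=1}^{n} \left[\frac{\rho'(r_j/a)}{\rho(r_j/a)}\left(-\frac{r_j}{a^2}\right) - \frac{1}{a}\right] = 0.$$

Let

$$\psi(z) = -\frac{\rho'(z)}{\rho(z)} \qquad \text{and} \qquad r_j = y_j - \alpha - \beta\,(x_j - \bar{x});$$

then the equations (3-9) can be written in the form

$$\sum_{j=1}^{n} \psi\!\left(\frac{r_j}{a}\right) = 0$$

$$\sum_{j=1}^{n} \psi\!\left(\frac{r_j}{a}\right)(x_j - \bar{x}) = 0 \qquad \text{(3-10)}$$

$$\sum_{j=1}^{n} \left[\psi\!\left(\frac{r_j}{a}\right)\frac{r_j}{a} - 1\right] = 0.$$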


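A method of iteration for solving the system of non-linear
equations (3-10) will now be described. First, start with
initial values

$$\alpha(0), \ \beta(0), \ a(0).$$

Re-arrange equations (3-10): the first equation

$$\sum_{j=1}^{n} \psi\!\left(\frac{r_j}{a}\right) = 0
\quad \text{becomes} \quad
\sum_{j=1}^{n} \frac{\psi(r_j/a)}{r_j/a}\; r_j = 0,$$

the second equation

$$\sum_{j=1}^{n} \psi\!\left(\frac{r_j}{a}\right)(x_j - \bar{x}) = 0
\quad \text{becomes} \quad
\sum_{j=1}^{n} \frac{\psi(r_j/a)}{r_j/a}\,(x_j - \bar{x})\; r_j = 0,$$

and the third equation

$$\sum_{j=1}^{n} \left[\psi\!\left(\frac{r_j}{a}\right)\frac{r_j}{a} - 1\right] = 0
\quad \text{is now} \quad
a = \frac{1}{n}\sum_{j=1}^{n} \psi\!\left(\frac{r_j}{a}\right) r_j.$$

With $r_j(k) = y_j - \alpha(k) - \beta(k)(x_j - \bar{x})$, the step from
iteration $k$ to iteration $k+1$ is

$$\sum_{j=1}^{n} \frac{\psi\big(r_j(k)/a(k)\big)}{r_j(k)/a(k)}
\big[y_j - \alpha(k+1) - \beta(k+1)(x_j - \bar{x})\big] = 0 \qquad \text{(3-11)}$$

$$\sum_{j=1}^{n} \frac{\psi\big(r_j(k)/a(k)\big)}{r_j(k)/a(k)}\,(x_j - \bar{x})
\big[y_j - \alpha(k+1) - \beta(k+1)(x_j - \bar{x})\big] = 0 \qquad \text{(3-12)}$$

$$a(k+1) = \frac{1}{n}\sum_{j=1}^{n} \psi\!\left(\frac{r_j(k+1)}{a(k)}\right) r_j(k+1). \qquad \text{(3-13)}$$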
From equations (3-11) and (3-12) expressions for $\alpha(k+1)$ and
$\beta(k+1)$ may be obtained. These can be substituted into
equation (3-13) to obtain $a(k+1)$. With a reasonably good guess
for $\alpha(0)$, $\beta(0)$ and $a(0)$ a few iterations should provide
sufficiently accurate solutions of the system of equations
(3-10).

Initial values for the parameters $\alpha(0)$, $\beta(0)$ and $a(0)$
may be obtained by least-squares regression.
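The iteration (3-11) to (3-13) amounts to repeated weighted least-squares fits with weights psi(r/a)/(r/a), followed by a scale update. A minimal sketch follows, assuming the Huber psi with a clipping constant `c` (playing the role of the cutoff in (3-6)) and least-squares starting values. The function names, the default `c = 1.5`, the fixed iteration count and the test data are illustrative assumptions, not the thesis program.

```python
def huber_psi(z, c):
    """Huber psi: identity inside [-c, c], clipped outside."""
    return max(-c, min(c, z))


def huber_line_fit(x, y, c=1.5, n_iter=50):
    """Fit y = alpha + beta*(x - xbar) by the iteration (3-11)-(3-13)."""
    n = len(x)
    xbar = sum(x) / n
    xc = [xi - xbar for xi in x]
    # Least-squares starting values alpha(0), beta(0), a(0).
    alpha = sum(y) / n
    beta = sum(u * yi for u, yi in zip(xc, y)) / sum(u * u for u in xc)
    r = [yi - alpha - beta * u for yi, u in zip(y, xc)]
    a = (sum(ri * ri for ri in r) / n) ** 0.5
    for _ in range(n_iter):
        if a <= 0.0:  # perfect fit: nothing left to reweight
            break
        # Weights psi(r/a) / (r/a) appearing in (3-11) and (3-12).
        w = [1.0 if ri == 0.0 else huber_psi(ri / a, c) / (ri / a) for ri in r]
        sw = sum(w)
        swx = sum(wi * u for wi, u in zip(w, xc))
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swxx = sum(wi * u * u for wi, u in zip(w, xc))
        swxy = sum(wi * u * yi for wi, u, yi in zip(w, xc, y))
        det = sw * swxx - swx * swx
        alpha = (swy * swxx - swx * swxy) / det
        beta = (sw * swxy - swx * swy) / det
        r = [yi - alpha - beta * u for yi, u in zip(y, xc)]
        # Scale update (3-13), using a(k) inside psi.
        a = sum(huber_psi(ri / a, c) * ri for ri in r) / n
    return alpha, beta, a


# Hypothetical data: exact line y = 2 + 3x with one wild observation.
x_data = [float(i) for i in range(1, 11)]
y_data = [2.0 + 3.0 * xi for xi in x_data]
y_data[-1] += 50.0
_, slope_ls, _ = huber_line_fit(x_data, y_data, c=float("inf"))  # psi(z) = z
_, slope_huber, _ = huber_line_fit(x_data, y_data, c=1.5)
print(slope_ls, slope_huber)
```

With `c` infinite the weights are all one and the routine reduces to ordinary least squares, which makes it easy to compare the two fits on the same data.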


C. PROPERTIES OF THE HUBER ESTIMATOR 
Consider the model $Y = A\,e^{-B/t}\,e^{\varepsilon}$.
Simulations were carried out with the above model for
$t = 1, 2, 3, \ldots, 20$. To investigate the properties
of the Huber estimation when errors are long-tailed, a "wild
shot" distribution with density function $p(x)$ given by

$$p(x) = q\,\frac{1}{k\sqrt{2\pi}}\,e^{-x^2/2k^2}
+ (1 - q)\,\frac{1}{3k\sqrt{2\pi}}\,e^{-x^2/2(3k)^2}$$

was used. Four values of $q$ were used: $q = 0.9, 0.825, 0.75,
0.6$. The values of $k$ were then chosen so that the distribution
$p(x)$ had a variance in the range 0.05 to 2.0. For each set
of parameter values, the simulation was replicated 100 times
to obtain 100 values of $\hat{A}$, from which the sample mean $\bar{A}$
and the sample variance were obtained. The sample distribution
of $\hat{A}$ was tested for normality by means of a plotting
subroutine described in Appendix A.
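A short sketch of how such errors might be generated is given below. It assumes the "wild shot" density is a mixture of a normal with standard deviation k (probability q) and a normal with standard deviation 3k (probability 1 - q); under that assumption the variance is k^2 (q + 9(1 - q)), which can be inverted to pick k for a target variance. The function names are hypothetical.

```python
import math
import random


def wild_shot_variance(k, q):
    """Variance of the mixture q*N(0, k^2) + (1-q)*N(0, (3k)^2)."""
    return k * k * (q + 9.0 * (1.0 - q))


def k_for_variance(target_var, q):
    """Choose k so that the mixture has the requested variance."""
    return math.sqrt(target_var / (q + 9.0 * (1.0 - q)))


def sample_wild_shot(n, k, q, rng):
    """Draw n errors: scale k with probability q, scale 3k otherwise."""
    errors = []
    for _ in range(n):
        sd = k if rng.random() < q else 3.0 * k
        errors.append(rng.gauss(0.0, sd))
    return errors


rng = random.Random(0)
k = k_for_variance(0.5, 0.9)
errors = sample_wild_shot(20000, k, 0.9, rng)
print(k, sum(e * e for e in errors) / len(errors))  # second value should be near 0.5
```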

The properties of the Huber estimate $\hat{A}$ obtained from the
simulation are

1. The Huber estimate $\hat{A}$ has a variance which seems
proportional to $\operatorname{Var}(\varepsilon)$. Examination of Table I, Appendix B,
indicates that the variance of the Huber estimator is smaller
than the variance of the least-squares estimator.

2. The Huber estimate $\hat{A}$ has an approximately normal
distribution for $0 < \operatorname{Var}(\varepsilon) < 0.5$.

3. For the "wild shot" distribution, which is symmetric,
the Huber estimator seems unbiased.
From Table I, $\operatorname{Var}(\hat{A}) = 7.23 \times 10^{-3}$ and $\bar{A} = 1.0056$.
Under the hypothesis $E(\hat{A}) = 1.0$,

$$\frac{\bar{A} - 1.0}{\sqrt{S^2/100}}$$

should have an approximate t-distribution with 100 degrees
of freedom.

The t-test at the 0.10 level of significance is

$$-t_{0.05} < \frac{\bar{A} - 1.0}{\sqrt{7.23 \times 10^{-3}/100}} < t_{0.05}$$

which may be rewritten as

$$1.0 - t_{0.05}\sqrt{7.23 \times 10^{-3}/100} < \bar{A} < 1.0 + t_{0.05}\sqrt{7.23 \times 10^{-3}/100}.$$

Since

$$t_{0.05}\sqrt{7.23 \times 10^{-3}/100} = 0.0146,$$

the interval is

$$0.9854 < \bar{A} < 1.0146.$$

The simulated value of $\bar{A}$ of 1.0056 is within the
interval identified above.


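4. A good fit for the variance of $\hat{A}$ against $\operatorname{Var}(\varepsilon)$ is
given by

$$\operatorname{Var}(\hat{A}) = D\,\operatorname{Var}(\varepsilon) + \delta\,[\operatorname{Var}(\varepsilon)]^2$$

where $\delta$ is a random variable which is approximately normal
for $0 < \operatorname{Var}(\varepsilon) < 0.5$ with $E(\delta) = 0$. This is shown in Fig. 5
where $\operatorname{Var}(\hat{A})/[\operatorname{Var}(\varepsilon)]^2$ is plotted against $1/\operatorname{Var}(\varepsilon)$.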
5. For values of $q$ between 0.6 and 0.9, the approximate
relationship between $\operatorname{Var}(\hat{A})$ and $\operatorname{Var}(\varepsilon)$ is

$$\operatorname{Var}(\hat{A}) = 0.066\,\operatorname{Var}(\varepsilon).$$

This relationship is obtained from Table III, Appendix B,
by taking the arithmetic mean of the slopes for
$q = 0.6, 0.75, 0.825$ and $0.9$. The reason for using the
arithmetic mean of the four slopes is that they are quite
close to one another (as compared to the spread), and the
number of replications (100) was not large enough to obtain
a more accurate estimate. In Chapter II the relationship for
least-squares obtained from theory was

$$\operatorname{Var}(\hat{A}) = 0.084\,\operatorname{Var}(\varepsilon).$$

Thus the Huber estimator has about 80 percent of the variance
of the least-squares estimator.

6. For assumptions on the distribution of $\varepsilon$ which
represent a severe departure from normality, such as the
Cauchy, examination of Table III.2 shows that for the four
values of the scale parameter used, the largest value of
the variance of the Huber method for $\hat{A}$ is about 1 while that


for the least-squares has a value of up to 10’. 
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IV. MISSPECIFICATION PROBLEM 


A. LEAST-SQUARES METHOD 

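The theoretical values for the misspecification
problem when the fit is linear or log-linear can be obtained
fairly easily. First, assume that $X_1, X_2, \ldots, X_n$ are i.i.d.
with $E(X_j) = \mu_j = 0$ and that the function $f(X_1, X_2, \ldots, X_n)$ can be
expanded in a Taylor series about the point $(\mu_1, \mu_2, \ldots, \mu_n)$.

Let

$$f_i(x_1, \ldots, x_n) = \frac{\partial f}{\partial x_i}(x_1, \ldots, x_n).$$

Then

$$f(x_1, \ldots, x_n) \approx f(\mu_1, \ldots, \mu_n) + \sum_{j=1}^{n} (x_j - \mu_j)\, f_j(\mu_1, \ldots, \mu_n),$$

thus

$$E[f(x_1, \ldots, x_n)] \approx f(\mu_1, \ldots, \mu_n)$$

and

$$\operatorname{Var}(f) \approx E\Big[\Big(\sum_{j=1}^{n} (x_j - \mu_j)\, f_j(\mu_1, \ldots, \mu_n)\Big)^2\Big]
= \sum_{j=1}^{n} \operatorname{Var}(x_j)\,\big[f_j(\mu_1, \ldots, \mu_n)\big]^2
= \operatorname{Var}(x) \sum_{j=1}^{n} \big[f_j(\mu_1, \ldots, \mu_n)\big]^2.$$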


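Now, suppose the observed data are from the model

$$Y_j = \frac{c\,j}{1 + c\,j}\,e^{\varepsilon_j}, \qquad \text{where } c = 0.0772,$$

but the model used to fit it is

$$Y_j = A\,e^{-B/j}\,e^{e_j}.$$

Then

$$e_j = \ln Y_j - \ln A + B/j$$

and least-squares demands minimization of $\sum_{j=1}^{n} e_j^2$.

The observed values of the $Y_j$ are $\frac{c\,j}{1 + c\,j}\,e^{\varepsilon_j}$. Hence

$$e_j = \ln\!\Big(\frac{c\,j}{1 + c\,j}\,e^{\varepsilon_j}\Big) - \ln A + B/j
= \ln\!\Big(\frac{c\,j}{1 + c\,j}\Big) + \varepsilon_j - \ln A + B/j.$$

Defining $L(A, B)$ as

$$L(A, B) = \sum_{j=1}^{n} \Big[\ln\!\Big(\frac{c\,j}{1 + c\,j}\Big) + \varepsilon_j - \ln A + B/j\Big]^2,$$

differentiating this function with respect to $A$ and $B$ and
setting equal to zero results in

$$\frac{\partial L}{\partial A} = -\frac{2}{A}\sum_{j=1}^{n} \Big[\ln\!\Big(\frac{c\,j}{1 + c\,j}\Big) + \varepsilon_j - \ln A + B/j\Big] = 0$$

$$\frac{\partial L}{\partial B} = 2\sum_{j=1}^{n} \frac{1}{j}\Big[\ln\!\Big(\frac{c\,j}{1 + c\,j}\Big) + \varepsilon_j - \ln A + B/j\Big] = 0. \qquad \text{(4-1)}$$

The least-squares solution for $\ln A$ is

$$\ln \hat{A} = \frac{
\Big(\sum_{j=1}^{n} \frac{1}{j^2}\Big)\sum_{j=1}^{n}\Big[\varepsilon_j + \ln\frac{c\,j}{1 + c\,j}\Big]
- \Big(\sum_{j=1}^{n} \frac{1}{j}\Big)\sum_{j=1}^{n} \frac{1}{j}\Big[\varepsilon_j + \ln\frac{c\,j}{1 + c\,j}\Big]
}{
n\sum_{j=1}^{n} \frac{1}{j^2} - \Big(\sum_{j=1}^{n} \frac{1}{j}\Big)^2
}. \qquad \text{(4-2)}$$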


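If the $\varepsilon_j$ are assumed to be i.i.d. with $E(\varepsilon_j) = 0$ and $\operatorname{Var}(\varepsilon_j)$
small, then the Taylor series approximation gives

$$E(\ln \hat{A}) \approx \frac{
\Big(\sum_{j=1}^{n} \frac{1}{j^2}\Big)\sum_{j=1}^{n} \ln\frac{c\,j}{1 + c\,j}
- \Big(\sum_{j=1}^{n} \frac{1}{j}\Big)\sum_{j=1}^{n} \frac{1}{j}\ln\frac{c\,j}{1 + c\,j}
}{
n\sum_{j=1}^{n} \frac{1}{j^2} - \Big(\sum_{j=1}^{n} \frac{1}{j}\Big)^2
}. \qquad \text{(4-3)}$$

For the variance of $\ln \hat{A}$,

$$\frac{\partial \ln \hat{A}}{\partial \varepsilon_k}(0) = \frac{
\sum_{j=1}^{n} \frac{1}{j^2} - \frac{1}{k}\sum_{j=1}^{n} \frac{1}{j}
}{
n\sum_{j=1}^{n} \frac{1}{j^2} - \Big(\sum_{j=1}^{n} \frac{1}{j}\Big)^2
}. \qquad \text{(4-4)}$$

Using (4-4) the variance for $\ln \hat{A}$ becomes

$$\operatorname{Var}(\ln \hat{A}) \approx \operatorname{Var}(\varepsilon) \sum_{j=1}^{n} \Big(\frac{\partial \ln \hat{A}}{\partial \varepsilon_j}(0)\Big)^2$$

and

$$\operatorname{Var}(\hat{A}) \approx e^{2E[\ln \hat{A}]}\,\operatorname{Var}(\ln \hat{A}). \qquad \text{(4-5)}$$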
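The expressions for the mean and variance of ln A under misspecification are easy to evaluate numerically. The sketch below assumes the reading of the formulas given here (least-squares intercept of ln(cj/(1+cj)) regressed on 1/j, and a variance equal to Var(e) times the sum of squared partial derivatives), with c = 0.0772, j = 1, ..., 20 and Var(e) = 0.072. The function name and argument defaults are illustrative assumptions.

```python
import math


def misspecified_ln_a(c=0.0772, n=20, var_eps=0.072):
    """Approximate E(ln A_hat) and Var(ln A_hat) for the misspecified fit."""
    x = [1.0 / j for j in range(1, n + 1)]                           # 1/j
    g = [math.log(c * j / (1.0 + c * j)) for j in range(1, n + 1)]   # ln(cj/(1+cj))
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    denom = n * sxx - sx * sx
    # Mean of ln A_hat: least-squares intercept with the errors set to zero.
    mean_ln_a = (sxx * sum(g) - sx * sum(xi * gi for xi, gi in zip(x, g))) / denom
    # Partial derivatives of ln A_hat with respect to each error term.
    grads = [(sxx - xk * sx) / denom for xk in x]
    var_ln_a = var_eps * sum(d * d for d in grads)
    return mean_ln_a, var_ln_a


mean_ln_a, var_ln_a = misspecified_ln_a()
print(mean_ln_a, math.exp(mean_ln_a), var_ln_a)
```

Under these assumptions the output should be close to the theoretical values quoted in the comparison that follows (about -0.577 for the mean of ln A and about 0.0061 for its variance).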


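1. Comparison of Theoretical and Simulated Results

For observed data from the model

$$Y_j = 1.0\,\frac{c\,j}{1 + c\,j}\,e^{\varepsilon_j}, \qquad c = 0.0772, \quad j = 1, \ldots, 20, \quad \operatorname{Var}(\varepsilon) = 0.072,$$

the theoretical values are

$$E(\ln \hat{A}) = -0.5768, \qquad E(\hat{A}) = 0.5617, \qquad \operatorname{Var}(\ln \hat{A}) = 6.055 \times 10^{-3}.$$

The observed values from computer simulation are

$$\overline{\ln \hat{A}} = -0.580, \qquad \operatorname{Var}(\ln \hat{A}) = 6.04 \times 10^{-3}.$$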
Normality plots not included with this thesis indicated
that $\ln \hat{A}$ was distributed approximately normally. Hence

$$\frac{\overline{\ln \hat{A}} - (-0.5768)}{\sqrt{0.00604/100}}$$

has an approximate t distribution with 100 degrees of freedom.
The confidence interval at the 0.10 significance
level is given by

$$-t_{0.05} < \frac{\overline{\ln \hat{A}} - (-0.5768)}{\sqrt{0.00604/100}} < t_{0.05}$$

or

$$-0.5901 < \overline{\ln \hat{A}} < -0.5635.$$

The observed value from simulation was $\overline{\ln \hat{A}} = -0.580$,
which is well within this confidence interval.


B. WEIGHTED HUBER ESTIMATOR 
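1. Model $Y = A\,e^{-B/t}\,e^{\varepsilon}$

Quite often weights $\{w_j\}$ are given to the jth observation
in order to reduce bias or the mean square error when
it is suspected that there is misspecification error in the
model. The equations (3-10) are modified to read

$$\sum_{j=1}^{n} w_j\,\psi\!\left(\frac{r_j}{a}\right) = 0$$

$$\sum_{j=1}^{n} w_j\,\psi\!\left(\frac{r_j}{a}\right)(x_j - \bar{x}) = 0$$

$$\sum_{j=1}^{n} w_j\left[\psi\!\left(\frac{r_j}{a}\right)\frac{r_j}{a} - 1\right] = 0$$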


and the method of iteration can again be used to solve this
system of equations as before.

2. Model $Y = A\,e^{-B/t^{c}}\,e^{\varepsilon}$

Since quite often better results can be obtained by
modifying the original model by the introduction of an
additional parameter, the model $Y = A\,e^{-B/t}\,e^{\varepsilon}$ was modified to
read $Y = A\,e^{-B/t^{c}}\,e^{\varepsilon}$. The following is a derivation of the
weighted Huber and least-squares estimators for this model.
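a. Weighted Huber Derivation

The weighted likelihood function for the observations is

$$\prod_{j=1}^{n} \left[\rho\!\left(\frac{\ln y_j - \ln A + B/t_j^{c}}{a}\right)\frac{1}{a}\right]^{w_j}.$$

For the purpose of simplifying notation replace $\ln A$ by $A$
and $\ln y_j$ by $y_j$. Then the log weighted likelihood function
is

$$L = \sum_{j=1}^{n} \left[w_j \ln \rho\!\left(\frac{y_j - A + B/t_j^{c}}{a}\right) - w_j \ln a\right].$$

Differentiating $L$ with respect to $A$, $B$, $c$ and $a$ and setting
equal to zero results in

$$\frac{\partial L}{\partial A} = \sum_{j=1}^{n} w_j\,\frac{\rho'}{\rho}\left(-\frac{1}{a}\right) = 0$$

$$\frac{\partial L}{\partial B} = \sum_{j=1}^{n} w_j\,\frac{\rho'}{\rho}\left(\frac{1}{a\,t_j^{c}}\right) = 0$$

$$\frac{\partial L}{\partial c} = \sum_{j=1}^{n} w_j\,\frac{\rho'}{\rho}\left(-\frac{B \ln t_j}{a\,t_j^{c}}\right) = 0$$

$$\frac{\partial L}{\partial a} = \sum_{j=1}^{n} \left[w_j\,\frac{\rho'}{\rho}\left(-\frac{y_j - A + B/t_j^{c}}{a^2}\right) - \frac{w_j}{a}\right] = 0$$

where $\rho'/\rho$ is evaluated at $(y_j - A + B/t_j^{c})/a$.
Let $\psi(z) = -\rho'(z)/\rho(z)$.
Then the above equations can be written as

$$p = \sum_{j=1}^{n} \frac{w_j}{a}\,\psi\!\left(\frac{y_j - A + B/t_j^{c}}{a}\right) = 0$$

$$q = \sum_{j=1}^{n} \frac{w_j}{a\,t_j^{c}}\,\psi\!\left(\frac{y_j - A + B/t_j^{c}}{a}\right) = 0 \qquad \text{(4-6)}$$

$$r = \sum_{j=1}^{n} \frac{w_j \ln t_j}{a\,t_j^{c}}\,\psi\!\left(\frac{y_j - A + B/t_j^{c}}{a}\right) = 0$$

$$s = -\frac{1}{a}\sum_{j=1}^{n} w_j + \frac{1}{a^2}\sum_{j=1}^{n} w_j\,(y_j - A + B/t_j^{c})\,\psi\!\left(\frac{y_j - A + B/t_j^{c}}{a}\right) = 0.$$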


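The unknown parameters $A$, $B$, $c$ and $a$ are the
solutions to the system of non-linear equations (4-6), which
can be solved by Newton's method of iteration. To simplify
notation let $r_j = y_j - A + B/t_j^{c}$, $\psi_j = \psi(r_j/a)$ and
$\psi'_j = \psi'(r_j/a)$. Then

$$\frac{\partial p}{\partial A} = -\frac{1}{a^2}\sum_j w_j \psi'_j, \qquad
\frac{\partial p}{\partial B} = \frac{1}{a^2}\sum_j \frac{w_j}{t_j^{c}}\,\psi'_j,$$

$$\frac{\partial p}{\partial c} = -\frac{B}{a^2}\sum_j \frac{w_j \ln t_j}{t_j^{c}}\,\psi'_j, \qquad
\frac{\partial p}{\partial a} = -\frac{1}{a^2}\sum_j w_j \psi_j - \frac{1}{a^3}\sum_j w_j r_j \psi'_j,$$

$$\frac{\partial q}{\partial A} = -\frac{1}{a^2}\sum_j \frac{w_j}{t_j^{c}}\,\psi'_j, \qquad
\frac{\partial q}{\partial B} = \frac{1}{a^2}\sum_j \frac{w_j}{t_j^{2c}}\,\psi'_j,$$

$$\frac{\partial q}{\partial c} = -\frac{1}{a}\sum_j \frac{w_j \ln t_j}{t_j^{c}}\,\psi_j - \frac{B}{a^2}\sum_j \frac{w_j \ln t_j}{t_j^{2c}}\,\psi'_j, \qquad
\frac{\partial q}{\partial a} = -\frac{1}{a^2}\sum_j \frac{w_j}{t_j^{c}}\,\psi_j - \frac{1}{a^3}\sum_j \frac{w_j r_j}{t_j^{c}}\,\psi'_j,$$

$$\frac{\partial r}{\partial A} = -\frac{1}{a^2}\sum_j \frac{w_j \ln t_j}{t_j^{c}}\,\psi'_j, \qquad
\frac{\partial r}{\partial B} = \frac{1}{a^2}\sum_j \frac{w_j \ln t_j}{t_j^{2c}}\,\psi'_j,$$

$$\frac{\partial r}{\partial c} = -\frac{1}{a}\sum_j \frac{w_j (\ln t_j)^2}{t_j^{c}}\,\psi_j - \frac{B}{a^2}\sum_j \frac{w_j (\ln t_j)^2}{t_j^{2c}}\,\psi'_j, \qquad
\frac{\partial r}{\partial a} = -\frac{1}{a^2}\sum_j \frac{w_j \ln t_j}{t_j^{c}}\,\psi_j - \frac{1}{a^3}\sum_j \frac{w_j r_j \ln t_j}{t_j^{c}}\,\psi'_j,$$

$$\frac{\partial s}{\partial A} = \frac{1}{a^2}\sum_j \Big[w_j(-1)\,\psi_j + w_j r_j\Big(-\frac{1}{a}\Big)\psi'_j\Big], \qquad
\frac{\partial s}{\partial B} = \frac{1}{a^2}\sum_j \Big[\frac{w_j}{t_j^{c}}\,\psi_j + \frac{w_j r_j}{a\,t_j^{c}}\,\psi'_j\Big],$$

$$\frac{\partial s}{\partial c} = -\frac{B}{a^2}\sum_j \Big[\frac{w_j \ln t_j}{t_j^{c}}\,\psi_j + \frac{w_j r_j \ln t_j}{a\,t_j^{c}}\,\psi'_j\Big], \qquad
\frac{\partial s}{\partial a} = \frac{1}{a^2}\sum_j w_j - \frac{2}{a^3}\sum_j w_j r_j \psi_j - \frac{1}{a^4}\sum_j w_j r_j^2\,\psi'_j.$$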
Is _ n n 5 ae Te 
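Let $z = (z_1, z_2, z_3, z_4)^T = (p, q, r, s)^T$ and
$x = (x_1, x_2, x_3, x_4)^T = (A, B, c, a)^T$.
Then the iterations are given by

$$A_{i+1} = A_i + \Delta A_i, \qquad B_{i+1} = B_i + \Delta B_i, \qquad
c_{i+1} = c_i + \Delta c_i, \qquad a_{i+1} = a_i + \Delta a_i$$

where the $\Delta$'s are the solution of the equation

$$\begin{pmatrix}
\frac{\partial z_1}{\partial x_1} & \frac{\partial z_1}{\partial x_2} & \frac{\partial z_1}{\partial x_3} & \frac{\partial z_1}{\partial x_4} \\
\frac{\partial z_2}{\partial x_1} & \frac{\partial z_2}{\partial x_2} & \frac{\partial z_2}{\partial x_3} & \frac{\partial z_2}{\partial x_4} \\
\frac{\partial z_3}{\partial x_1} & \frac{\partial z_3}{\partial x_2} & \frac{\partial z_3}{\partial x_3} & \frac{\partial z_3}{\partial x_4} \\
\frac{\partial z_4}{\partial x_1} & \frac{\partial z_4}{\partial x_2} & \frac{\partial z_4}{\partial x_3} & \frac{\partial z_4}{\partial x_4}
\end{pmatrix}
\begin{pmatrix} \Delta A_i \\ \Delta B_i \\ \Delta c_i \\ \Delta a_i \end{pmatrix}
= -\begin{pmatrix} p \\ q \\ r \\ s \end{pmatrix}$$

with all quantities evaluated at $(A_i, B_i, c_i, a_i)$.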
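The Newton step described above (solve the linearized system for the increments, then update) can be sketched generically. The code below is an illustration only: it uses a forward-difference Jacobian instead of the analytic partial derivatives, and the names `solve_linear`, `newton_system` and the small two-equation test system are assumptions made for the example, not the thesis program.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def newton_system(func, x0, tol=1e-10, max_iter=50, h=1e-6):
    """Newton iteration for func(x) = 0 with a forward-difference Jacobian."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):
        z = func(x)
        if max(abs(v) for v in z) < tol:
            return x
        jac = [[0.0] * n for _ in range(n)]
        for col in range(n):
            xh = x[:]
            xh[col] += h
            zh = func(xh)
            for row in range(n):
                jac[row][col] = (zh[row] - z[row]) / h
        # Solve J * delta = -z, then update x <- x + delta.
        delta = solve_linear(jac, [-v for v in z])
        x = [xi + di for xi, di in zip(x, delta)]
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical test system: x^2 + y^2 = 4 and x = y.
root = newton_system(lambda v: [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] - v[1]], [1.0, 2.0])
print(root)
```

Raising an error when the iteration limit is reached makes divergence explicit, which matters for this problem because the choice of starting values strongly affects whether the iteration converges.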
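b. Weighted Least-Squares Derivation

The weighted least-squares estimator for the model
$Y = A\,e^{-B/t^{c}}\,e^{\varepsilon}$ leads to the normal equations

$$p = \sum_{j=1}^{n} w_j \Big[\ln y_j - \ln A + \frac{B}{t_j^{c}}\Big] = 0$$

$$q = \sum_{j=1}^{n} \frac{w_j}{t_j^{c}} \Big[\ln y_j - \ln A + \frac{B}{t_j^{c}}\Big] = 0$$

$$r = \sum_{j=1}^{n} \frac{w_j \ln t_j}{t_j^{c}} \Big[\ln y_j - \ln A + \frac{B}{t_j^{c}}\Big] = 0$$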
and the iterations are given by 
- gny a W . 
= n J 
te tic. ~ «(C&nA) Se, GAC oh cee 
ae =¢. - J i J= se! j=. “el 
tb il 
n Rnt 
ae: ere? W. ] 
1 jel J tics 
nh w.kny. n We n Ww. 
Cay, <2 LS (SWS ee a eee | 
: ey j=l teCsay ( Fi j=1 t.ci4y 1 j=l t.2c. 14 
i+] 1 1 n W . 
[ 2, pat. | 
(2nA). j=l (t5C544) 

n W.-&ny.k&nt. n wW-.&nt. n W. 
ee enn). 2 OB ee 
ieee 2 ier 1 j=l t.c. Leb j=1.- 2c 

(QnA) ; , = (nA) - 1+ alle 
: n w.£nt. 
! By Cec 
J j itl 


C. COMPUTER SIMULATION OF THE MISSPECIFIED MODELS 

The data analyzed were generated from the model
$Y = 1.0\,\frac{Bt}{1 + Bt}\,e^{\varepsilon}$
with $B = 0.0772$ and with var(e) = 0.1, 0.2 and 0.3 initially. The
value of $B$ was chosen so that the expected values have a range
from 0.07 to 0.60. The values for $t$ were from one to twenty.
The sample mean $\bar{A}$ and sample variance $\operatorname{Var}(\hat{A})$ were calculated
for a sample size of 100. The weighting systems chosen were

$$w(j) = 1, \quad j, \quad j^{1.5}, \quad j^{2}, \quad 0.70^{\,20-j}, \quad 0.85^{\,20-j}$$

in order to
find weights which show promising results so as to concentrate
the analysis on the interesting cases.
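To see how strongly each weighting system favors the later observations, the weights can be tabulated directly. The sketch below assumes the six systems as read from Table VI (1, j, j^1.5, j^2, 0.70^(20-j), 0.85^(20-j)) and reports the share of total weight carried by the last five time points; the function names are hypothetical.

```python
def weight_systems(n=20):
    """Weights w(j), j = 1..n, for the six systems assumed from Table VI."""
    js = range(1, n + 1)
    return {
        "1": [1.0 for _ in js],
        "j": [float(j) for j in js],
        "j^1.5": [j ** 1.5 for j in js],
        "j^2": [float(j * j) for j in js],
        "0.70^(n-j)": [0.70 ** (n - j) for j in js],
        "0.85^(n-j)": [0.85 ** (n - j) for j in js],
    }


def share_of_last(weights, m=5):
    """Fraction of the total weight placed on the last m observations."""
    return sum(weights[-m:]) / sum(weights)


systems = weight_systems()
for name, w in systems.items():
    print(name, round(share_of_last(w), 3))
```

Equal weights put a quarter of the weight on the last five points, while the heavier systems put more than half of it there.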
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1. Numerical Results 
Preliminary examination of Tables VI, IX, and X shows
that

1. For the least-squares fit to $Y = A\,e^{-B/t^{c}}\,e^{\varepsilon}$ the
expected value of $\hat{A}$ is practically independent of the system
of weights chosen, with $E(\hat{A})$ approximately equal to 1.30,
which is quite surprising.

2. For the model $Y = A\,e^{-B/t}\,e^{\varepsilon}$ the least-squares
method and the Huber method have approximately the same
expected value and variance for low weights.

3. For heavy weighting systems, the bias of the
Huber method is considerably less than the bias of the
least-squares method.

4. The iteration for the Huber method with the model
$Y = A\,e^{-B/t^{c}}\,e^{\varepsilon}$ for var(e) = 0.2 had some peculiar features.
Specifically,

a. About eighty to ninety percent of the simulations
with different random number seeds did not result in
convergence.

b. Examination of Table V appears to show that
the system of non-linear equations for the Huber method has
multiple solutions.

c. Even when the initial starting values for the
Newton iteration were the convergent values of one of the
sequences, divergence of the iteration still usually occurred
if a different seed was used.

The probable cause of the divergence for the iteration
of the Huber method for the mis-specified model is that
the var(e) chosen was too large. One of the difficulties of
the Huber method for the model $Y = A\,e^{-B/t^{c}}\,e^{\varepsilon}$ is choosing a
suitable set of parameters (4) as the initial starting values
for the iteration. Usually it is quite difficult to choose
even three suitable initial values.

As a result, it was decided to concentrate further
analysis of the mis-specified problem on heavy weighting
systems with the Huber method for the model
$Y = A\,e^{-B/t}\,e^{\varepsilon}$, as
these gave the least bias. Also, as the initial value of
var(e) used was too large to be practical, the value of
var(e) was reduced to var(e) = 0.005, 0.01, and 0.02. The
weighting systems chosen are those listed in Tables VII and VIII.

Examination of Tables VII and VIII shows that the systems
which weighted the later data values heavily and early
data values very lightly still gave the best results. The
weighting system given by $w(j) = j^{4}$ has the smallest bias and
mean square error although it had an estimator $\hat{A}$ with the
largest variance.
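The trade-off between bias and variance in these tables can be made explicit with the usual decomposition M.S.E. = Var + bias^2, with the true final value A = 1. The thesis does not state this formula, but the legible entries of Tables VII and VIII appear consistent with it. The function name below is an illustrative assumption.

```python
def mean_square_error(var_a, mean_a, true_a=1.0):
    """M.S.E. of an estimator from its variance and mean (bias = mean - true value)."""
    bias = mean_a - true_a
    return var_a + bias * bias


# Legible entries: (Var(A_hat), mean of A_hat, tabulated M.S.E.)
rows = [
    (0.00043, 0.662, 0.115),    # Table VII
    (0.00154, 0.750, 0.0640),   # Table VII
    (0.0033, 0.795, 0.0453),    # Table VIII
    (0.0077, 0.816, 0.0416),    # Table VIII
]
for var_a, mean_a, tabulated in rows:
    print(round(mean_square_error(var_a, mean_a), 4), tabulated)
```

A weighting system can therefore have the largest variance and still the smallest mean square error, provided its bias is sufficiently smaller.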

ee Ncdsiaua LE Curve EEE eane olnaals 

The following curve fitting experiments were performed
for the Huber method and the least-squares method for the
model $Y = A\,e^{-B/t}\,e^{\varepsilon}$ with data generated from the model
$Y = 1.0\,\frac{Bt}{1 + Bt}\,e^{\varepsilon}$ with the parameter values previously mentioned. Two
curves were plotted in each figure. The curve which has the
smaller value of y for small values of t is the fitted curve.
The second curve is the expected value of Y for the data
generated. The figures are:

1. 6-13 var(e) = 0.005

2. 14-21 var(e) = 0.010.

The unweighted least-squares curves (Figures 6, 14)
were obviously bad fits. Comparison of the various figures
shows that in general the heavier weighting systems gave
better fitted curves for large values of t and poorer fits
for small values of t.
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V. DISCUSSION 


A. JUSTIFICATION FOR HUBER METHOD 

For the correctly specified model given by $Y = A\,e^{-B/t}\,e^{\varepsilon}$
with $E(\varepsilon) = 0$, the Huber method has a slightly smaller
variance than least-squares when the random variable $\varepsilon$ is
a mixture of two normals. For $\operatorname{Var}(\varepsilon) < 0.5$, the distribution
of the Huber estimator $\hat{A}$ is approximately normal. Thus the
Huber estimator of $A$ is a better estimator than least-squares
for long-tailed distributions. For non-symmetric
distributions of $\varepsilon$ the least-squares and Huber methods are
comparable (see Table IV).

Another possible objection to the least-squares method
for the fit $Y = A\,e^{-B/t}\,e^{\varepsilon}$
is that very often it is not known
what the true model is. The least-squares method does not
seem robust against a mis-specified model, judging from our
experiments. Thus the Huber method might well be preferred
to the least-squares method when the true model is not known,
judging from the experiments carried out to date.


Another variation of the model $Y = A\,e^{-B/t}\,e^{\varepsilon}$ is
$Y = A\,e^{-B/(t+k)}\,e^{\varepsilon}$, which was considered for the data
$Y = 1.0\,\frac{Bt}{1 + Bt}\,e^{\varepsilon}$.
This model did not appear promising for the following
reasons:

1. Comparing the plots for the various methods of fit
shows that the fitted curves were too flat
as compared to the curve $Y = 1.0\,\frac{Bt}{1 + Bt}$ for values of t
satisfying $12 < t < 20$. Since the parameter k was introduced
to match more closely the slopes of the fitted curve and the
data for values of t between, say, 15 and 20, as well as the y
values, this implies that k would have to be negative.
However, the function $y = A\,e^{-B/(t+k)}$ has an infinite discontinuity
at $t = -k$. Thus values of $k < -1$ are not reasonable for the
range of values of t considered above. One way of overcoming
this difficulty would be to discard the observations for
$0 < t < 5$, since the future values are less dependent on the
distant past observations than on those observations which are
more recent.


B. AREAS OF FURTHER STUDY 


1. Since the exact form of the growth is not known,
further investigation in this area is needed. For example,
the model used to generate the data in this thesis was
$Y = A\,\frac{Bt}{1 + Bt}\,e^{\varepsilon}$.


However, there is no reason to believe that Y = Glee eyes 
Or any similar growth model could not have been used. 


2. The range of values for t was from one to twenty 


(1-20). Other ranges on t need to be investigated. 
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APPENDIX A: SUBROUTINE NMPLOT 


The subroutine NMPLOT was used as an indicator as to
whether a set of n observations $x_1, x_2, \ldots, x_n$ comes from a
normal distribution. To carry out the test, the numbers
$x_1, x_2, \ldots, x_n$ are sorted in increasing order, say $y_1, y_2, \ldots, y_n$.
The sample mean $\bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$ and the sample variance
$s^2 = \frac{1}{n-1}\sum_{j=1}^{n} (y_j - \bar{y})^2$ are then calculated. Let
$z_j = (y_j - \bar{y})/s$.
If the original set $x_1, \ldots, x_n$ were normal, then for rather
large n (at least 50) the set $\{z_j\}$, $j = 1, \ldots, n$, should have
properties similar to those of the order statistics from a
standard normal distribution. Let $\Phi$ be the cumulative
distribution function of the standard normal. Then a plot
of $j/n$ versus $\Phi(z_j)$ should lie approximately on a straight
line passing through the origin with slope one. To decide
whether the sample $x_1, \ldots, x_n$ is approximately normal, a plot
is also obtained from n observations from a standard normal and the
two plots compared. The subroutine is part of the computer


program for the weighted Huber method for the model Y = Ae. 
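The plotting coordinates described in this appendix can be sketched as follows. This is not the NMPLOT subroutine itself: the function names, the use of the n - 1 divisor for the sample variance, and the summary statistic `max_deviation` are assumptions made for illustration.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def nmplot_points(xs):
    """Return the points (j/n, Phi(z_j)) for the sorted, standardized sample."""
    ys = sorted(xs)
    n = len(ys)
    mean = sum(ys) / n
    s = math.sqrt(sum((y - mean) ** 2 for y in ys) / (n - 1))
    return [((j + 1) / n, normal_cdf((y - mean) / s)) for j, y in enumerate(ys)]


def max_deviation(points):
    """Largest vertical distance of the plot from the line of slope one."""
    return max(abs(p - f) for p, f in points)


rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
points = nmplot_points(sample)
print(max_deviation(points))  # small for a normal sample
```

For a sample that is close to normal the points stay near the diagonal, so the maximum deviation is small; a reference plot from simulated standard normal data, as the appendix describes, gives a sense of how small to expect.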
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Var(e) 


Table III. Variance for the Huber Method, Wild Shot Distribution.
Model: Y = 1.0 e^{-B/t} e^{e}. Entries are Var(A); one row per value
of Var(e). The Var(e) column and the entries marked [illegible] cannot
be read in the source.

Row      q=0.6         q=0.75      q=0.825       q=0.9
 1       0.0060        0.0058      0.0072        0.0068
 2       0.0144        0.0118      0.0145        0.0156
 3       0.0257        0.0254      [illegible]   0.0209
 4       [illegible]   0.0197      0.0250        0.0241
 5       0.0527        0.0181      0.0255        0.0329
 6       0.0393        0.0273      0.0345        0.0434
 7       0.0458        0.0489      0.0566        0.0490
 8       0.0492        0.0468      0.0451        0.0508
 9       0.0668        0.0515      0.0679        0.0553
10       0.0512        0.0876      0.0734        0.0683
11       0.0697        0.0552      0.0802        0.0681
12       0.0871        0.0427      0.0614        0.0939
13       0.0807        0.0588      [illegible]   0.0909
14       0.0752        0.1080      0.1017        0.1340
15       0.1381        0.0789      0.0888        0.1530
16       0.1091        0.1060      0.0858        0.1389
17       0.0933        0.0860      0.1008        0.1214
18       0.0982        0.1700      0.1071        0.1970
19       0.1493        0.1330      [illegible]   0.2070
20       0.1685        0.1730      0.2015        0.1650
Slope    0.0641        0.0593      [illegible]   0.070
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E is mixture of two distributions 


with probability 0.9 


Zz 
1 
, with probability 0.1 
X is the standard normal 
HUBER LEAST SQUARES 
p A Var(A) A Var (A) 
Z, 0.0707 1.020 0.00100 1.021 0.00099 
02250. 90s Oat 1.029 0.00147 1.030 0.00147 
0.1414 1.030 0.00218 L2032Z 0.00215 
0.0707 107.5 0.00102 1.024 0.00101 
2,=0.25 0.10 1.030 0.00142 1.030 0.00141 


0.1414 1.020 0.00203 1.020 0.00201 


Table IV. Comparison of Least Squares and Huber Methods for 
Non-Symmetric Distributions. 
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Table V. Convergent Sequences for the Huber Method with 
Misspecified Model. 
_ ,,7B/t*_e€ a Peak: 
Y = Ae e with weights w(j) = 1 
Iteration (£nA) Be Ge a. 
1 1 1 

1 0.2000 3.000 0.5000 0.8000 
16 -0.1083 3.6682 0.7245 -18.179 
7 -0.1086 3.6678 0.7246 -18.072 
18 -0.1086 3.6678 0.7246 -17.965 
19 -0.1086 3.6678 0.7246 -17.858 
15 0.7872 3.1258 0.2924 9.006 
16 0.7969 3.1349 0.2911 8.226 
17 037971 S135) 0.2910 (BONO 
18 0.7971 o. 135 1 0.2911 6.649 
19 0.7971 S255! 0.2911 5.862 
20 0.7971 S21 55 1 0.2911 5.076 
iS 0.0912 2.6268 0.4214 0.8348 
16 0.0912 2.6268 0.4214 0.7991 
ilps 0.0912 2.6268 0.4214 0.76057 
18 0.0912 2.6268 0.4214 ' 0.7348 
19 © 0.0912 2.6268 0.4214 0.7064 

20 0.0912 2.6268 0.4212 0.6803 
15 0.3407 a OLaS 44 0:1975 
16 0.3346 See SS 0.4560 0.1929 
17 0.3360 BZ 07 0.4557 0.1939 
18 0.3356 Se 2505 0.4558 21956 
19 0555/7 Se 2504 0.4557 021937 
20 0.3357 Se2504 0.4558 0.1936 





w(j) 
1 
j 
lee 
j 
2 
j 
0.70295 
0.8579°J 
1 
j 
Ae 
j 
2 
j 
0.707973 
0.857973 
1 
j 
1.5 
j 
2 
ie 
0.702973 
0.857°°9 
Table VI. 


Y = Ae 


= 1.0 


Var(ce) 
0.10 


One 
0.10 
0.10 
Oe.) 
0.20 
OF 2,0 
Veale 
0.20 
0.20 
U0 
0250 
0.30 
0.30 


Bt 
1+Bt 


Weighted Huber 
-c/t t 


e® b 


Var (A) 
0.00041 
0.0055 
0.0123 
0.0220 
0.0414 
0.0038 
0.0154 
0.0149 
0.0325 
0.068 
0.127 
0.0138 
0.0209 
0.0183 
0.0344 
0.066 
0.199 


0.0134 
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5 Sos) 
Borer 
~697 
. 740 
.761 
~655 
.560 
.663 


78 
~839 © 
~674 
o3 70 
~645 
eo! 
~754 


.663 


Method with Mis-specified 


eCeEE 
186 
.107 
.104 
090 
098 
123 
.195 
.128 
.110 
.116 
153 
.120 
206 
144 
24 
127 
S22 


lay 


Model 





Be 


Y= 1.0 qe e B = 0.0772 q = 0. 
w(j)           Var(e)    Var(A)     A             M.S.E.
0.70^(20-j)    0.005     0.00333    [illegible]   0.0629
0.85^(20-j)    0.005     0.00043    0.662         0.115
j^1.5          0.005     0.00056    0.710         0.0847
j^2            0.005     0.00118    0.749         0.0642
0.70^(20-j)    0.01      0.00503    0.756         0.0646
0.85^(20-j)    0.01      0.00087    0.664         0.114
j^1.5          0.01      0.00121    0.718         0.0807
j^2            0.01      0.00154    0.750         0.0640
0.70^(20-j)    0.02      0.0119     0.781         0.0600
0.85^(20-j)    0.02      0.00117    0.659         0.117
j^1.5          0.02      0.00253    0.707         0.0883
j^2            0.02      0.00432    0.742         0.0709
Table VII. Weighted Huber Method with Misspecified Model 
Y = Ae </* o&. 
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w(j) Var(e) 
j> 0.005 
cos 0.005 
fj? 0.01 
;* 0.01 
;° 0.02 
a 0.02 


Var (A) 
0.0033 
0.0043 
0.0035 
0.0077 
0.0092 
0.0179 


>>| 


7 95 
7555 
. 796 
-816 
SES 
839 


0453 
-0315 
~0451 
.0416 
US SS 
.0438 


‘Table VIII. Weighted Huber Method with Misspecified Model 
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pie) B= 0s0772 gq = 0.9 


Y= 1.0 pee 
w(j) Var(e) Var (A) A M.S.E. 
1 0.10 0.00430 1.30 0.094 
j 0.10 0.00415 1.33 0.113 
aoe 0.10 0.0067 130 0.097 
5? 0.10 0.0074 1.29 0.092 
ia70- 2 eto 0.00683 1.26 0.074 
mme5- 92 0.10 0.0030 1.30 0.093 
1 0.20 0.00896 1.28 0.087 
j 0.20 0.0126 _ 14 0.128 
ie ae 0.20 0.0146 1.30 0.105 
;° 0.20 0.0182 1.32 0.121 
ioe: 0.20 0.0154 1.25 0.078 
sce) O20 0.0140 Le 0.116 
1 0.30 0.0165 | 1.28 0.095 
j 0.30 0.0130 1.34 0.129 
oo 0.30 0.0179 1.31 0.114 
a2 50 0.0268 1.29 deta 
EO) omso 0.0361 BY 0.109 
seo 


0250 0.0139 eo 4 0.110 


Table IX. Least-Squares Method with Misspecified Model 


pas 
Y = Ae D/t” of 
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w(j) 


Table xX. 


Bt 


Y = 1.0 77H B 
Var(e) Var (A) 
0.10 0.00285 
0.10 0.00526 
0.10 0.00685 
0.10 O0152 
0.10 0.0194 
0.10 0.00677 
O20 0.00517 
0.20 0.0116 
0.20 0.0214 
0.20 0.0200 
0.20 0.0599 
0.20 ~ 0.0139 
0.30 0.00831 
0.30 0.0165 
0.30 0.0181 
0.30 0.0429 
0.30 0.0597 
0.30 0.0230 


0.0772 


>> | 
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654 


Oe © 


738 


sOSZ 


wo c 


~654 


e711 


694 


~741 


~658 


.270 


668 


075 


fou 


Ey ee 


.658 


0. 


M.S.E. 
0.191 
0.125 
0.114 
0.084 
0.079 
0.129 | 
0.188 
Dyes 
0.105 
0.114 
Oe 127, 
0.131 
0.193 
0.127- 


UsiZs 


Ores. 
0.134 


Least-Squares Method with Misspecified Model 


Y = Ae c/t e® 
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Figure 1. Comparison of Variances of Huber and Least-Squares 


Estimator, var(e) < 0.5. 
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Figure 2. Linear Fit for Variance of Huber Estimator for 
q = 0.9. 
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Figure 3. Linear Fit for Variance of Huber Estimator for 


q = 0.825. 
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Figure 4. Linear Fit for Variance of Huber Estimator, for 
q = Q.60. 
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Unweighted Least-Squares, Var(e) = 0.005. 
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Figure 6. 
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Figure 7. 







Unweighted Huber, Var(e) 
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0.005, A = 0.572. 
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Figure 8. Weighted Least-Squares, var(e) = 0.005, w(j)=j? 
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Figure 9. Weighted Huber, var(e) = 0.005, w(j) = j’, 
A = 0.773. 
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Figure 10. Weighted Least-Squares, var(e) = 0.005, w(j)=j° 
A = 0.797. | 
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Figure 11. Weighted Huber, var(e) = 0.005, w(j) = j’ 
A = 0.810. : 
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Figure 12. Weighted Least-Squares, var(e) =.0.005, w(j)=j’ 
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A = 0.828. 




Figure 13. Weighted Huber, var(e) = 0.005, w(j) = j* 





Figure 14.Unweighted Least-Squares, var(e) = 0.10, A = 0.548, 
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Figure 15.UnWeighted Huber, var(e) = 
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0.010, A = 0.548. 
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Figure 16. Weighted Least-Squares, var(e) = 0.010, w(j) 
: = 3°, A = 064. 
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E@euresl/. | We1ghted Huber, var(e) = 0.010, w(j) = j’ 
A = 0.678. 
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Figure 18. Weighted 
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‘Figure 19. Weighted Huber, var(e) = 0.01, w(j) = j?, 
A = 0.694. 
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‘Figure 20. Weighted Least-Squares, var(e) = 0.010, w(j) 
r = J 3 A = 1; 019 1, 


Figure 21. Weighted Huber, var(e) = 0.010, w(j) = j,
A = 0.691.
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